Abstract

We explore further the discovery potential for heavy quarks at the LHC, with emphasis on
the t' and b' of a sequential fourth family associated with electroweak symmetry breaking. We
consider QCD multijets, $t\bar{t}$ + jets, W + jets and single t backgrounds using event generation
based on improved matrix elements and low sensitivity to the modeling of initial state radiation.
We exploit a jet mass technique for the identification of hadronically decaying W's and t's,
to be used in the reconstruction of the t' or b' mass. This along with other aspects of event
selection can reduce backgrounds to very manageable levels. It even allows a search for both t'
and b' in the absence of b-tagging, of interest for the early running of the LHC. A heavy quark
mass of order 600 GeV is motivated by the connection to electroweak symmetry breaking, but
our analysis is relevant for any new heavy quarks with weak decay modes.

1 Introduction 

The mystery of the replication of families is part of the flavor problem. But unlike the first three
families, a possible fourth family may have a more easily understood role to play. Fourth family
fermions with masses above about 550 GeV would couple strongly to the Goldstone bosons of
electroweak symmetry breaking [1]. This is another way of saying that these fermions are involved with
the strong dynamics of electroweak symmetry breaking. To put it even more strongly, these fermion
masses are then the natural order parameters for electroweak symmetry breaking. Meanwhile the
fourth family may be the last sequential family and in this way complete the flavor structure of the
theory. The joining of these two issues, the flavor problem and electroweak symmetry breaking, is
a prime motivation to consider the fourth family.

Strong interactions, rather than a Higgs, would unitarize WW scattering. But given some
unknown strong interactions it remains to determine the massive propagating degrees of freedom
that most strongly affect this scattering of Goldstone bosons. For example, for the scattering of
pseudo-Goldstone bosons of QCD it is the $\rho$. For most theories of electroweak symmetry breaking
it is also a boson, either scalar or vector. Instead we are proposing that the propagating degrees of
freedom are fermions. This requires that the strong interactions break chiral symmetries without
confining the massive fermions. This is reminiscent of the old NJL model, which then forms the
basis for a bottom-up description of the effective dynamics. In fact in the absence of the Higgs
boson, effective four-fermion interactions must also be responsible for feeding mass from the heavy
fermions to lighter quarks and leptons. The size of such operators is determined by inverse powers
of a new mass scale, the scale of flavor physics, which therefore cannot be that far removed from
the electroweak scale. Thus a fourth family would not only recast the flavor problem, but it would
also force us to conclude that the scale of flavor physics is nearby.

We note that the main effect of a light Higgs boson in electroweak precision data is to shift the
value of the T parameter by a positive amount. If there is no light Higgs, then something else must
produce a positive $\Delta T$. But the mass splitting in the heavy quark doublet does just that. For a
more detailed analysis that shows how the S and T constraints can be satisfied through appropriate
masses for the fourth family quarks and leptons see [2]. That reference also relates the t mass to a
contribution to the heavy quark mass splitting, with the implication that $m_{b'} > m_{t'}$. This result is
based on an analysis of the approximate symmetries of operators that may be necessary to account
for the t mass while remaining consistent with other constraints such as the $Zb\bar{b}$ coupling. See the
appendix for a brief summary of that argument.

Assuming some CKM mixing between the third and fourth families, one or both of the following 
processes should be important. 

$pp \to t'\bar{t}' \to W^+W^-b\bar{b}$ (1)
$pp \to b'\bar{b}' \to W^+W^-t\bar{t}$ (2)

If $m_{b'}$ and $m_{t'}$ differ by more than the W mass then certainly the heavier of t' or b' will decay
into the lighter, and only one of these processes will be important. If $m_{b'}$ and $m_{t'}$ differ by less
than the W mass then the mass splitting and the value of the CKM mixing angles will determine
the importance of transitions between t' and b' involving virtual W's. For example if $m_{t'}$ = 600
GeV and $m_{b'}$ = (670, 650, 630) GeV then the rate for $b' \to Wt$ will be comparable to or dominate
$b' \to W^{*}t'$ for a mixing angle > (.01, .04, .001) respectively. Thus we see how process (2) could
still be important even if we are correct about $m_{b'} > m_{t'}$.

In our previous work [3] we developed a search strategy for process (1), where we used the
invariant mass of single jets to identify the hadronic decays of W's. This method has been studied
in [4] and in cases [5, 6] like ours where the W's in the signal events are both well boosted and
isolated. The jet invariant mass distribution for the signal events has a strong peak close to $m_W$ for
an appropriate cone size in the jet finding algorithm. A W-jet is defined to be a jet with invariant
mass within about 10 GeV of $m_W$. It is seen in [3] that this method is significantly less efficient
at identifying the less isolated W's of the main irreducible background, $t\bar{t}$ production. Thus in
comparison to a more traditional search for heavy quarks [7] where the two jets in $W \to jj$ are
identified, an enhancement of the signal to background ratio S/B in the reconstruction of the t' mass
is obtained.
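
The W-jet definition above can be illustrated with a minimal sketch. It assumes the jet's constituents are available as four-momenta (E, px, py, pz) in GeV; the function names, the value used for $m_W$ and the 10 GeV default window are illustrative choices and are not taken from the analysis code.

```python
import math

M_W = 80.4  # GeV; approximate W mass, an assumed input


def jet_mass(constituents):
    """Invariant mass of a jet from constituent four-momenta (E, px, py, pz)."""
    e = sum(c[0] for c in constituents)
    px = sum(c[1] for c in constituents)
    py = sum(c[2] for c in constituents)
    pz = sum(c[3] for c in constituents)
    m2 = e * e - px * px - py * py - pz * pz
    # Guard against small negative values from rounding.
    return math.sqrt(max(m2, 0.0))


def is_w_jet(constituents, window=10.0):
    """True if the single-jet invariant mass lies within `window` GeV of M_W."""
    return abs(jet_mass(constituents) - M_W) < window


# Two massless decay products sharing the W rest energy form a W-jet;
# a single massless constituent does not.
w_like = [(40.2, 40.2, 0.0, 0.0), (40.2, -40.2, 0.0, 0.0)]
qcd_like = [(100.0, 100.0, 0.0, 0.0)]
```

A well boosted, isolated W tends to have both decay products inside one cone, which is why the summed constituents reproduce a mass near $m_W$.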

In the case of process (2) we shall explore the use of the jet invariant mass technique to identify
both W's and t's [5, 6] through their hadronic decays. And for both processes (1) and (2) we
shall consider a major background not considered in [3], that of QCD multijets. This background
requires a more restrictive event selection. An interesting consequence of these new constraints is
that they make possible an effective search without b-tagging.

One measure of the background to new heavy particle production is the size of the high energy
tail of the $H_T$ distribution (scalar sum of everything in the event including missing energy). The
$H_T$ tail is sensitive to initial state radiation (ISR), and thus the modeling of ISR in event generators is an
important factor in background estimation. For example in stand-alone Pythia [8] and in the case of
$t\bar{t}$ production, the default setting has a high cutoff on the phase space of the ISR ("power showers"
[9]) in order to obtain realistic $p_T$ distributions of the hardest extra jets. On the other hand realistic
$p_T$ distributions of extra jets will also arise from the use of the appropriate perturbative matrix
elements, either those that are beyond lowest order at tree level (Alpgen [10]), and/or those that
are next-to-leading-order at one loop (MC@NLO [11]). We found [3] that the generators MC@NLO-Herwig [12],
Alpgen-Herwig and Alpgen-Pythia were in good agreement in their results for the high
$H_T$ tail. In comparison stand-alone Pythia with $p_T$-ordered power showers significantly inflates the
high $H_T$ tail of $t\bar{t}$ production. (Stand-alone Herwig produced a similar inflation.) The reason for
this is that the high cutoff relaxes the relation between different contributions to $H_T$. In particular
less energy in the $t\bar{t}$ system, where the partonic cross section is larger, can be made up by the
energy of jets from ISR. The improved matrix elements on the other hand more strongly constrain
the relative amounts of energy in $t\bar{t}$ versus the extra jets.

Figure 1: Overlaid histograms show the lack of sensitivity of Alpgen-Pythia to Pythia's modeling
of initial state radiation in $t\bar{t}$ production, when compared to stand-alone Pythia. The QW tune is
used for both, while $p_T^{\min}$ is an Alpgen jet parameter.

Let us further compare Alpgen-Pythia to stand-alone Pythia, where both are using the same
Pythia tune (as described below) with power showers turned off (MSTP(68)=0). In Fig. (1) we show
the $H_T$ tails for $t\bar{t}$ production, with and without initial state radiation (MSTP(61)=1 or 0). The very
large sensitivity to ISR that is apparent in Pythia is drastically reduced in Alpgen-Pythia. This
implies that the MLM jet-parton matching scheme [15] of Alpgen is very efficient at vetoing any
extraneous hard ISR from Pythia, beyond that implied by the improved matrix elements of Alpgen.

With respect to the Alpgen jet parameter $p_T^{\min}$, the slightly greater sensitivity at $p_T^{\min}$ = 100 GeV rather
than $p_T^{\min}$ = 75 GeV arises from the $t\bar{t}$ + jet sample, where the latter becomes more susceptible
to Pythia's handling of ISR for larger $p_T^{\min}$. Also by comparing the two results of Alpgen-Pythia
that include ISR we see very little sensitivity to the choice of $p_T^{\min}$, which is a further check of the
jet-parton matching scheme.

From this we are encouraged to use Alpgen-Pythia exclusively in the estimation of background.
Alpgen currently does not have a user interface for new physics models, and therefore we will use
Madgraph [13]-Pythia exclusively for signal generation. We will use the CTEQ6.1 PDF consistently
within Alpgen-Pythia and Madgraph-Pythia. This PDF more accurately represents the gluon structure
function, which is stronger at the relevant x than given by CTEQ5L. CTEQ6.1 is a NLO PDF,
while Alpgen-Pythia only goes part way towards NLO. But if and when NLO matrix elements are
introduced it is instructive to see the effect of this while keeping everything else the same, including
the PDF. For example the cross section for the production of $t'\bar{t}'$ or $b'\bar{b}'$ (with 600 GeV masses)
increases from ~0.9 to ~1.4 pb due to the effect of the NLO matrix elements from MC@NLO.
But such enhancements, the K-factors, affect both signal and background and in our previous work
we found that Alpgen without K-factors produced a signal to background ratio very similar to
MC@NLO.
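
As a worked illustration of why a common K-factor leaves S/B unchanged, the following uses only the two signal cross sections quoted above. The symbols $K_S$ and $K_B$ are introduced here for the signal and background K-factors; the background value is not given in the text, so the last step is a statement of the assumption rather than a result.

```latex
K_S \equiv \frac{\sigma_{\rm NLO}}{\sigma_{\rm LO}} \approx \frac{1.4\ {\rm pb}}{0.9\ {\rm pb}} \approx 1.6,
\qquad
\left(\frac{S}{B}\right)_{\rm NLO} = \frac{K_S\, S}{K_B\, B} = \frac{K_S}{K_B}\left(\frac{S}{B}\right)_{\rm LO}
\approx \left(\frac{S}{B}\right)_{\rm LO} \quad \text{if } K_S \approx K_B .
```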

We will adopt the QW Pythia tune [14] which is basically the popular DW tune adapted to the
CTEQ6.1 PDF; only the PARP(82) value is changed. For the renormalization/factorization scale we
always choose [...]. We note that S/B has little sensitivity to this choice; both signal and background
cross sections decrease by nearly identical amounts, about 25%, when the renormalization
scale is increased to $\sqrt{\hat{s}}$.

2 $t'\bar{t}'$ production and backgrounds

We use the PGS4 detector simulator [16] with the ATLAS default set of parameters (from the
Madgraph package) and with trigger selections turned off. We use the cone based jet finder with a
cone size of 0.6. We replace the b tag/mistag efficiencies in PGS4 by (1/2, 1/10, 1/30) (for $|\eta| < 2$
and vanishing above) for underlying b's, c's and gluons/light quarks respectively. Our choices should
be more appropriate given the high $p_T$'s of the b-jets.
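
The replacement tagging rates can be written as a small lookup, shown here as a sketch. The function name and the flavor labels are hypothetical; only the three rates and the $|\eta| < 2$ acceptance come from the text, and this is not PGS4's own interface.

```python
# Flat tag/mistag rates quoted in the text, applied for |eta| < 2.
TAG_RATE = {
    "b": 1.0 / 2.0,       # true b-jets
    "c": 1.0 / 10.0,      # charm jets
    "light": 1.0 / 30.0,  # gluon and light-quark jets
}


def tag_probability(flavor, eta):
    """Probability that a jet of the given underlying flavor is b-tagged."""
    if abs(eta) >= 2.0:
        return 0.0  # the rates vanish outside the acceptance
    return TAG_RATE[flavor]
```

In a fast simulation one would typically compare a uniform random number against this probability for each jet.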

For our study of $t'\bar{t}' \to WWb\bar{b}$ in [3] our focus was on the $t\bar{t}$ + jets and W + jets backgrounds.
The event selection included a lower bound $\Lambda_H$ on the $H_T$ of the event^1 and a lower bound $\Lambda_b$
on the $p_T$ of a b-tagged jet. We found that $\Lambda_H = 2m_{t'}$ and $\Lambda_b = m_{t'}/3$ worked well, and we will fix
$m_{t'}$ = 600 GeV.^2 We require one W-jet, defined by having an invariant mass within 9 GeV of $m_W$.
In the t' mass reconstruction we consider all pairs of identified W and b jets in each event, where
for all such pairs we require an "angular" separation $\Delta R < 2.5$. We also veto any event with a jet
having $|\eta| > 2.5$ and $p_T > 200$ GeV.
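
The selection just described can be sketched as follows. This is only an illustration under stated assumptions: the `Jet` container, the function names and the value of $m_W$ are hypothetical, $H_T$ is passed in precomputed, and "one W-jet" is read as "at least one". The `require_btag` switch anticipates the option of dropping the b-tag while keeping the $p_T$ bound on the second jet.

```python
import math
from dataclasses import dataclass

M_W = 80.4                 # GeV; approximate W mass, an assumed input
M_TPRIME = 600.0           # GeV; the t' mass fixed in the text
LAMBDA_H = 2.0 * M_TPRIME  # lower bound on H_T
LAMBDA_B = M_TPRIME / 3.0  # lower bound on the p_T of the jet paired with the W-jet


@dataclass
class Jet:
    pt: float
    eta: float
    phi: float
    mass: float
    btag: bool = False


def delta_r(a, b):
    """Separation in (eta, phi), with the azimuthal difference wrapped to [0, pi]."""
    dphi = abs(a.phi - b.phi) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(a.eta - b.eta, dphi)


def four_vector(j):
    px = j.pt * math.cos(j.phi)
    py = j.pt * math.sin(j.phi)
    pz = j.pt * math.sinh(j.eta)
    e = math.sqrt(px * px + py * py + pz * pz + j.mass * j.mass)
    return e, px, py, pz


def pair_mass(a, b):
    ea, ax, ay, az = four_vector(a)
    eb, bx, by, bz = four_vector(b)
    m2 = (ea + eb) ** 2 - (ax + bx) ** 2 - (ay + by) ** 2 - (az + bz) ** 2
    return math.sqrt(max(m2, 0.0))


def candidate_masses(jets, ht, require_btag=True):
    """Return the W-jet + jet invariant masses of all pairs passing the cuts."""
    if ht < LAMBDA_H:
        return []
    # Veto events with a hard forward jet.
    if any(abs(j.eta) > 2.5 and j.pt > 200.0 for j in jets):
        return []
    w_jets = [j for j in jets if abs(j.mass - M_W) < 9.0]
    partners = [j for j in jets
                if j.pt > LAMBDA_B and (j.btag or not require_btag)]
    return [pair_mass(w, b) for w in w_jets for b in partners
            if w is not b and delta_r(w, b) < 2.5]
```

Every surviving pair contributes one entry to the mass reconstruction plot, so an event can enter more than once.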

^1 In [3] we used the scalar $p_T$ sum of the five hardest objects, which gives similar results.
^2 In [3] we also considered $m_{t'}$ = 800 GeV.

Our study here will include the QCD multijet background and to adequately suppress this the
event selection needs to be tightened further. Thus far we have required one W-jet, but now we
must require the leptonic decay of the other W and accept the loss of ~80% of the signal. The
requirement for isolated leptons and/or missing energy fortunately causes an even more drastic
reduction of the multijet background. We consider a loose and a tight cut.

loose: isolated lepton or missing energy in excess of 250 GeV 

tight: isolated lepton and missing energy in excess of 30 GeV 

The isolated^3 leptons (electron or muon) also have $p_T$ > 20 GeV.
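
The two cuts differ only in whether the lepton and missing energy conditions are combined with "or" or with "and". A minimal sketch, with hypothetical function and argument names, and with the isolation decision itself taken as given:

```python
def count_isolated_leptons(leptons):
    """Count isolated electrons/muons with p_T > 20 GeV.

    `leptons` is a list of (pt, is_isolated) pairs; how isolation is decided
    is outside the scope of this sketch.
    """
    return sum(1 for pt, isolated in leptons if isolated and pt > 20.0)


def passes_lepton_met(n_isolated_leptons, missing_et, cut="loose"):
    """Apply the loose or tight lepton/missing energy requirement (GeV)."""
    has_lepton = n_isolated_leptons > 0
    if cut == "loose":
        return has_lepton or missing_et > 250.0
    if cut == "tight":
        return has_lepton and missing_et > 30.0
    raise ValueError("cut must be 'loose' or 'tight'")
```

For example, an event with no isolated lepton and 300 GeV of missing energy passes the loose cut but fails the tight one.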

These new constraints along with the jet mass technique are together so effective that they allow
us to treat the b-tagging of a jet as optional. Thus our analysis will be done with and without b-tags,
where in the latter case we maintain the $p_T$ > 200 GeV constraint on the jet that is combined with
the W-jet in the t' mass reconstruction. One motivation for eliminating the b-tag is to cover the
possibility that CKM mixing is such that $t'\bar{t}'$ (or $b'\bar{b}'$) $\to W^+W^-q\bar{q}$ is important, where q is a light
quark. Another motivation is that b-tagging, especially at high $p_T$, may not be very efficient in the
early running of the LHC.

For the $t\bar{t}$ + jets background we have Alpgen generate samples for 0, 1 and 2 extra hard partons,
using the MLM jet-parton matching scheme. The maximum jet pseudorapidity and the minimum
jet separation are set to 2.5 and 0.7 respectively. We choose $p_T^{\min}$ = 100 GeV for the Alpgen jet
definition; with this choice the $t\bar{t}$ + 1 jet sample dominates both the exclusive $t\bar{t}$ + 0 jet sample and
the inclusive $t\bar{t}$ + 2 jet sample in the signal region. More precisely it dominates on the high $H_T$
tails as shown in Fig. (2a). We are thus ensured that the Alpgen generated matrix elements are
controlling the bulk of the showering.

Figure 2: Overlaid histograms show the relative sizes of the jet multiplicity samples on the high
$H_T$ tail, for a) $t\bar{t}$ + jets with $p_T^{\min}$ = 100 GeV (left) and b) W + jets with $p_T^{\min}$ = 150 GeV (right).

For the W + jets background we have Alpgen generate the W + 1, W + 2, and W + 3 jet samples,
where we allow the W to decay inclusively. Here we use $p_T^{\min}$ = 150 GeV, and we display the
relative contributions on the high $H_T$ tail in Fig. (2b). We also consider the single top production
process $pp \to (t/\bar{t})(b/\bar{b})W$ since it represents another irreducible background. This is also modeled
in Alpgen with $p_T^{\min}$ = 150 GeV for the b-jet. We shall see that this latter background is small, and other
backgrounds such as $b\bar{b}$ + jets, Z + jets, $(W/Z)b\bar{b}$, (WW/ZZ/WZ) + jets are even more insignificant.

^3 The electron and muon isolation cuts are those described in [17].

Potentially more serious is the QCD multijet background. Since the cross sections are so large,
it becomes nontrivial for event generators to generate sufficient integrated luminosity to make the
background estimate. Here Alpgen again proves helpful since it allows the exclusive 2-jet sample,
with its enormous cross section, to be separated out. We would like the 3-jet sample to dominate the
2-jet sample in the signal region, and this occurs if $p_T^{\min}$ is not too large. On the other hand by
increasing $p_T^{\min}$ we can reduce the cross sections, with the reductions proportionally greater for the
higher jet multiplicities. A compromise is to take $p_T^{\min}$ = 200 GeV. This is small enough so that
the jets in the 2-jet sample satisfying the $\Lambda_H$ cut are mostly back-to-back so that their combined
invariant mass is typically much larger than $m_{t'}$, thus removing them from the signal region. Also
this value of $p_T^{\min}$ is large enough so that the 4-jet sample, the inclusive sample, is smaller than the
3-jet sample in the signal region.

The exclusive 2-jet contribution still has an enormous cross section, and so to explore its effect
we temporarily drop the lepton/missing energy requirements and the b-tagging. We still require a
jet with $p_T > \Lambda_b$ to form an invariant mass when combined with a W-jet; then we can compare the
2-jet with the sum of the 3-jet and 4-jet samples in the signal region of the t' reconstruction plot.
It is easier to generate sufficient events under these conditions and we find that the 2-jet sample is
roughly 1/2 as large. Thus we can concentrate on generating sufficient integrated luminosity of the
3 and 4-jet samples with leptons/missing energy/b-tagging constraints reinstated, and ignore the
2-jet sample, with the knowledge that the 2-jet sample contributes no more than another 50%. This
possible additional 50% is certainly an overestimate, since the likelihood of mis-identified leptons
or fake missing energy will be less for the 2-jet sample than for the higher multiplicity samples.
In fact for the (small) integrated luminosity that we have generated for the 2-jet sample, none of
the events survive on the t' mass reconstruction plots. On the plots to follow we do not make any
correction for the neglected 2-jet sample.

We show the signal and the various backgrounds as stacked histograms on the t' mass
reconstruction plots in Fig. (3), where the two plots are for the loose and tight lepton/missing energy
constraints. We see the successful suppression of the multijet background to almost insignificant
levels. Without the lepton/missing energy constraints, the multijet background would be several
times higher than the height of the signal peak. We also note that the fall-off of the backgrounds
for large invariant mass $M_{Wj}$ is controlled by our constraint on $\Delta R_{Wj}$.

Figure 3: The signal from $t'\bar{t}' \to W^+W^-b\bar{b}$ compared to various backgrounds with b-tagging, where
the loose and tight cuts refer to the isolated lepton and/or missing energy requirements. The various
contributions, including the signal, have been stacked (not overlaid).

This strength of signal to background encourages us to consider results without the b-tag, as
shown in Fig. (4). The multijet background remains small while the W + jets background becomes
substantially more important. Nevertheless we see that the discovery potential for the heavy quarks
is still quite attractive without b-tagging, thus providing an opportunity in the early running of the
LHC before b-tagging methods are well developed.




3 $b'\bar{b}'$ production and backgrounds

If the b' mass is larger than the t' mass by more than the W mass, then the following process will occur.

$pp \to b'\bar{b}' \to W^+W^-t'\bar{t}' \to W^+W^-W^+W^-b\bar{b}$

This basically increases the signal discussed in the last section. Two of the W's will be relatively
soft since the heavy quark mass splitting cannot be too large. The leptonic decays of these W's will
add to the likelihood of observing isolated leptons, thus enhancing this contribution to the signal.
For $m_{b'}$ = 700 and $m_{t'}$ = 600 GeV we find that $b'\bar{b}'$ production has a cross section about 40% that
of $t'\bar{t}'$ production. We compare the two contributions to the signal in Fig. (5).




Figure 5: The signal appearing in Figure (4) is shown in isolation, along with the additional signal
that would arise from $b'\bar{b}'$ production when $m_{b'}$ = 700 GeV.

We now consider the process of interest if $b' \to Wt$ is the dominant decay mode of b':

$pp \to b'\bar{b}' \to W^+W^-t\bar{t}$

Here we set $m_{b'}$ = 600 GeV. If $m_{t'}$ is sufficiently larger than $m_{b'}$ then this signal is enhanced further
through

$pp \to t'\bar{t}' \to W^+W^-b'\bar{b}' \to W^+W^-W^+W^-t\bar{t}$,

(and the signal of the last section disappears) but we will ignore this in the following. Our object
will be to explore the feasibility of using single jet invariant masses to identify both the t and the
W through their hadronic decays, and from them reconstruct the b' mass.

A drawback is that the cone size that is optimal to identify W jets is not optimal to identify the
t jet, since a significantly larger cone size is necessary to capture the three proto-jets of a boosted t
decay.^4 Our compromise, not optimal for either identification, is the choice of 0.8 for the cone size.
A W-jet is defined by an invariant mass within 9 GeV of $m_W$ as before. For the t the associated
invariant mass peak in the signal events is broad and not nearly as strong as the W peak. Thus we
make a loose definition of a t-jet as a jet with invariant mass greater than 100 GeV and $p_T$ > 300
GeV. We use the same loose and tight lepton/missing energy constraints as described before. The
only other difference is to tighten the upper bound on $\Delta R$ between the t and W jets to 2.0.
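
The two single-jet tags used for the b' reconstruction can be sketched as below. The function names and the value of $m_W$ are assumptions made for the example; the mass window, the t-jet thresholds and the $\Delta R$ bound are those quoted above.

```python
M_W = 80.4  # GeV; approximate W mass, an assumed input


def classify_jet(mass, pt):
    """Label a jet as 'W', 't' or None from its invariant mass and p_T (GeV)."""
    if abs(mass - M_W) < 9.0:
        return "W"
    if mass > 100.0 and pt > 300.0:
        return "t"  # loose t-jet definition
    return None


def pair_allowed(delta_r_tw):
    """Upper bound on the separation between the t-jet and the W-jet."""
    return delta_r_tw < 2.0
```

The W window ends below 100 GeV, so the two labels cannot overlap and the order of the checks does not matter.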

For the QCD jet background we use Alpgen to generate the 2, 3, and 4-jet samples as before.
Again none of the 2-jet sample actually generated survives as background to the b' mass
reconstruction. We can also bound the possible contribution of a 2-jet sample as before by removing
the lepton/missing energy requirements, in which case the 2-jet sample is about 1/4 the size of the
3 + 4-jet sample in the signal region. Once again we make no correction for dropping the 2-jet
sample.

^4 The jet mass technique was used in [5] to identify t's from the decay of vector-like quarks more massive than
ours, so that the t's were more strongly boosted.

Using this event selection we produce the results for the b' mass reconstruction in Fig. (6).
Although the signal size is hampered as we have described, we note that the background reduction
appears again to be very effective. And it is again of interest to note that our use of the jet mass
technique has made possible a search for b' without the use of b-tags.




4 Conclusion 

We believe that our search strategy for new heavy quarks at the LHC improves on the more
traditional analysis modeled after the t quark discovery at Fermilab. A key role is played by the
jet mass technique to identify W's (and t's) through their hadronic decays, which in the case of
$t'\bar{t}'$ production acts to suppress the main irreducible background from $t\bar{t}$ production. We have
found that among the various background processes, only the QCD multijet background forces
requirements for isolated leptons and/or missing energy. But with these requirements the search
for both the t' and b' can be undertaken without the use of b-tagging.

For the actual estimation of backgrounds we have found Alpgen-Pythia to be useful, both to
avoid the excessive sensitivity of stand-alone Pythia to the modeling of initial state radiation, and in
the estimation of the multijet background. It is possible that our use of the fast detector simulator
PGS4 could be leading to an overly optimistic estimate of the background reduction. A full detector
simulation is certainly warranted, especially with regard to the efficiency of isolated lepton and
missing energy constraints on event selection. Nevertheless the strong signal to background results
that we have exhibited provide reason to believe that fourth family quarks could be discovered in
the early running of the LHC.

5 Appendix



A serious issue for a model of dynamical electroweak symmetry breaking is the generation of the large
t mass in a manner compatible with electroweak precision measurements. We briefly summarize
an argument [2, 18] based on approximate symmetries of effective operators that suggests a way
out. Since we are interested in approximate symmetries that can constrain operators that generate
mass and/or feed down mass, the approximate symmetries should be axial-like. For the third and
fourth family quarks $(q'_L, q'_R, q_L, q_R)$ with $q' = (t', b')$ and $q = (t, b)$, there are two such axial-charge
generators to consider: $Q_1$: (+, -, -, +) and $Q_2$: (+, -, +, -).

We then categorize some effective operators of interest in terms of the charges they carry, where
all may be written in an $SU(2)_L \times U(1)$ invariant manner.

1) $\bar{t}'_L t'_R \bar{t}'_R t'_L$, $\bar{b}'_L b'_R \bar{b}'_R b'_L$ (neutral under both charges)

2) $\bar{t}'_L t'_R \bar{t}_R t_L$, $\bar{b}'_L b'_R \bar{b}_R b_L$ (charged under $Q_1$)

3) $\bar{b}'_L b'_R \bar{t}_L t_R$, $\bar{t}'_L t'_R \bar{b}_L b_R$ (charged under $Q_2$)

Operators of type 1 and 2 are such that they can be generated by gauge boson exchange, while the
type 3 operators with their LRLR structure cannot be. Type 1 operators represent the dynamics
generating mass for the t' and b' while type 2 and 3 operators can feed mass from the fourth family
to the third family quarks. Type 2 operators are usually considered for this task. The trouble
is that this set of operators includes other operators that are dangerous, in particular those that
contribute to the T parameter and the $Zb\bar{b}$ vertex. In particular it is nontrivial to arrange gauge
boson exchanges to generate the t mass while not also generating unwanted effects [20].

Type 2 operators are all suppressed if $Q_1$ corresponds to a good approximate symmetry. If $Q_2$ is
more badly broken than $Q_1$ then the t mass can instead arise from an operator of type 3. We will
refer to $\bar{b}'_L b'_R \bar{t}_L t_R$ as the t-mass operator. The b-mass on the other hand can come either from the
accompanying operator in class 3 (related by an $SU(2)_R$ transformation of the t-mass operator) or
from an operator of the suppressed class 2. In either case we see that the nonperturbative dynamics
responsible for type 3 operators must badly break $SU(2)_R$.

Figure 7: Effects arising from two insertions of the t-mass operator.

The mere existence of the t-mass operator (its partner $\bar{t}'_L b'_R \bar{b}_L t_R$ by $SU(2)_L$ symmetry is implicit)
implies that some operators of class 2 will be generated. But since class 2 operators are $Q_2$ invariant,
two insertions of the t-mass operator are necessary. The resulting effects are thus suppressed by
$(m_t/m_{t'})^2$ and a loop factor. One example is the $Zb\bar{b}$ vertex correction in Fig. (7a). Another is the
correction to the b' mass in Fig. (7b), which is not shared by the t' mass. This is the origin of the
expectation that $m_{b'} > m_{t'}$.